BSure 0.0.0.9000
BSure is a method for the robust inference of gene essentiality types for pooled CRISPR survival screens and for the assessment of screen quality and design of screens. It uses Bayesian methods to model explicitly measurement noise and levels of replicability. We obtain a set of possible values for the essentiality score, each of them with a probability of it being the real value, that is a posterior probability distribution of scores. This is illustrated in the figure below. The credible interval is an interval that contains the unknown true value of the essentiality with a probability of 95%. The higher the screen quality, replicability of the experiment and agreement of gRNAs, the narrower the credible interval, that is we learn more about the essentiality from a higher quality screen and can narrow down the range in which the true essentiality type must be contained. For details see Strauss and Parts, 2020. In preparation, and the examples and illustrations in this vignette.
Range of essentiality types: posterior distribution and credible interval
library(BSure)
To illustrate the method, we first load a subset from the transcriptome-wide screen in the HT29 cancer cell line published in Allen et al. (2019). The reduced dataset contains a mix of essential and nonessential genes.
data(counts_HT29_small)
We compute log-fold changes between counts from the knockout screens and the plasmid library.
colnames(HT29_small)
## [1] "sgRNA" "gene" "HT29_C902R1_P1D14"
## [4] "HT29_C902R2_P1D14" "HT29_C902R3_P1D14" "HT29_c903R1"
## [7] "HT29_c903R2" "HT29_c903R3" "HT29_c903R4"
## [10] "HT29_c903R5" "HT29_c903R6" "HT29_c904R1"
## [13] "HT29_c904R2" "HT29_c904R3" "HT29_controls"
#The last column is the control measurement. The first column contains the gRNAs
#and the second the names of the corresponding genes.
nc <- ncol(HT29_small)
counts <- HT29_small[,-c(1:2,nc)]#removing columns of gRNAs, genes and controls
controls <- HT29_small[,nc]
lfc_small <- compute_lfc_from_counts(counts=counts, controls=controls)
HT29_lfc_small <- cbind(HT29_small$sgRNA,HT29_small$gene,lfc_small)#required format
#for BSure algorithm:
#first column: gRNAs, second column: corresponding genes, third to last column:
#log-fold changes for all replicates
Now we can run the BSure algorithm. For the small dataset, we can run it on one compute core.
runBSure(lfc=HT29_lfc_small,save_file_name="HT29_lfc_small",min_tail_ESS=500)
and analyse the output.
load("HT29_lfc_small.rda")#save_file_name from above plus ".rda"
extracted_output_HT29 <- extract_from_output(output)
plot_credible_intervals(extracted_output_HT29,"HT29_small")
## quartz_off_screen
## 2
This created a folder named HT29_small, containing the following plots:
HT29_small_credible_intervals_as_function_of_mean.pdf
In this figure the credible intervals for the essentiality of all genes, defined by their upper bounds (brown line) and lower bounds (black line) were plotted along the y-axis and the mean estimates of essentiality for the corresponding genes were plotted along the x-axis. The smaller the credible intervals, the more precisely we know the true essentiality type of the gene. As we see in this figure, credible intervals tend to be narrower for nonessential genes (on the right, with mean essentiality types around 0) than for more essential genes (on the left, strongly negative mean essentiality types.) Narrow credible intervals are indicative of a reliable screen of good quality which allows robust assessment of essentiality types for all genes. This is an example of a good screen with acceptable credible intervals.
HT29_small_density_length_of_credible_interval.pdf
This figure illustrates the distribution of the lengths of the credible intervals across all genes.
|
|
|
HT29_small_Rhat_credible_interval.pdf and HT29_small_tail_ESS_credible_interval.pdf
The two figures above are technical figures to check if there were any problems with the algorithm running on the dataset and inferring reliable credible intervals. Problems would be indicated by correlations apparent from the figures above. There are no problems apparent in this case.
extracted_output_HT29 has the following output
sample summary: a data frame containing mean estimates of the essentiality (means), credible intervals (CI_lower_bounds for the lower bounds of the credible intervals, CI_upper_bounds for the upper bounds), prob_essential_I: probability of being essential of type I (see below for illustration), prob_essential_II: probability of being essential of type II (higher level for essentiality, see below)
prop_expensive_sampling and pro_very_expensive_sampling: a technical measure of the percentage of genes for which more expensive sampling methods were required.
Essentiality of type II is determined by comparing the posterior distribution of the gene to that of known core essential genes (Hart et al. (2017)) and core fitness genes (F. M. Behan et al. (2019)), for details see Strauss and Parts (2020), for illustrations see the examples further below.
To illustrate, essentiality types I and II, we now run BSure on a small subset of genes (vector_of_genes=genes_HT29_small[seq(1,length(genes_HT29_small),10)]), this time plotting the posterior distributions by setting the variable plot_folder_name to the name of a subdirectory where the plots will be saved.
genes_HT29_small <- unique(HT29_lfc_small[,2])
runBSure(lfc=HT29_lfc_small,save_file_name="HT29_lfc_small_sub",n_cores=1,min_tail_ESS=500,vector_of_genes=genes_HT29_small[seq(1,length(genes_HT29_small),10)],plot_folder_name="HT29_small_gene_distributions")
An example of a highly essential gene. It is essential of type II with probability 1, as its posterior distribution is shifted by less than 1/3 compared to the distribution of core essential genes.
This gene is essential of type I because of the distinctness of its distribution from that of nonessential genes. Its evidence for being essential of type II is lower, as only part of its distribution is shifted less than 1/3 to the right compared to the distribution of core essential genes.
For genome-wide screens we recommend running BSure in parallel on 5-25 cores. The following example applies the method to the full genome-wide HT29 screen, from which the small dataset above was an extract. The code below runs BSure and plots credible intervals and diagnostic figures as illustrated above for the small example dataset. As running this function on a genome-wide screen is computationally more expensive and because of current problems concerning the parallel package if used with RStudio in connection with R-4, the following code cannot be run from RStudio with R-4.0.0-R-4.0.2.
library(BSure)
nCores <- 25
data("counts_HT29")
dir.create("results_HT29")
nc <- ncol(counts_HT29)
lfc <- cbind(counts_HT29[,1:2],compute_lfc_from_counts(counts_HT29[,-c(1:2,nc)],counts_HT29[,nc]))
save_file_name <- "results_HT29/HT29_full"
runBSure(lfc,save_file_name,nCores,2000)
load(paste(save_file_name,".rda",sep=""))
extracted_output <- extract_from_output(output)
save(extracted_output,file=paste0(save_file_name,"_extracted_output.rda"))
plot_credible_intervals(extracted_output,"results_HT29")
The figures below compare for the screen analysed the following sets of genes: A) genes on a list of core essential (Hart et al. (2017)) and core fitness genes (F. M. Behan et al. (2019)). B) genes on a list of nonessential genes (Hart et al. (2014)), C) genes identified as essential of type I by the BSure method, D) genes identified as essential of type II by the BSure method. The following are characteristics of a screen of good power to distinguish between genes of different levels of essentiality. The main purpose of the scores below is across-screen comparison of screen quality and reliability. The suggested thresholds are therefore not hard cutoffs. In particular, if stricter FDR thresholds are used, then all of the four scores will be lower. Therefore, we recommendusing the default FDR of 0.05.
data("HT29_full_extracted_output")
extracted_gene_categories_HT29 <- extract_gene_categories(HT29_full_extracted_output,figures = T)
extracted_gene_categories_HT29$proportion_core_essential_in_essential_II
## [1] 0.6581921
extracted_gene_categories_HT29$proportion_core_essential_in_essential_I
## [1] 1
extracted_gene_categories_HT29$proportion_non_essential_genes_in_essential_II
## [1] 0.001340483
extracted_gene_categories_HT29$proportion_non_essential_genes_in_essential_I
## [1] 0.1206434
extracted_gene_categories_HT29$score
## [1] 0.690647
We load the processed output for screens on the K562 cell (Peets et al. (2019)) line with different Cas9 activity levels and different experiment coverage.
data("resultsK562_highest_extracted_output") #screens with highest experiment coverage levels (160x-180x), lower Cas9 activity, 3' readout
data("resultsK562_high_extracted_output") #120x-130x experiment coverage levels, lower Cas9 activity, 3' readout
data("resultsK562_lowest_extracted_output") #25x-30x experiment coverage levels, lower Cas9 activity, 3' readout
data("resultsK562_highCas9_extracted_output")#higher Cas9 activity, 134x coverage, 3' and 5' readout
data("resultsK562_highest_3_5_extracted_output")
First, we look at the gold standard screen
extracted_gene_categories_K562_best <- extract_gene_categories(resultsK562_highCas9_extracted_output)
extracted_gene_categories_K562_best$proportion_core_essential_in_essential_II
## [1] 0.8107345
extracted_gene_categories_K562_best$proportion_core_essential_in_essential_I
## [1] 0.9237288
extracted_gene_categories_K562_best$proportion_non_essential_genes_in_essential_II
## [1] 0.00886918
extracted_gene_categories_K562_best$proportion_non_essential_genes_in_essential_I
## [1] 0.04767184
extracted_gene_categories_K562_best$score
## [1] 0.8177175
Now we check how well genes of different essentiality types are retained with the lower Cas9 levels 3' and 5' readout (but highest experiment coverage) - we see that all scores are slightly lower.
extracted_gene_categories_K562_highest_3_5<- extract_gene_categories(resultsK562_highest_3_5_extracted_output)
extracted_gene_categories_K562_highest_3_5$proportion_core_essential_in_essential_II
## [1] 0.6920904
extracted_gene_categories_K562_highest_3_5$proportion_core_essential_in_essential_I
## [1] 0.8813559
extracted_gene_categories_K562_highest_3_5$proportion_non_essential_genes_in_essential_II
## [1] 0.01108647
extracted_gene_categories_K562_highest_3_5$proportion_non_essential_genes_in_essential_I
## [1] 0.04545455
extracted_gene_categories_K562_highest_3_5$score
## [1] 0.7355563
Next we check how well genes of different essentiality types are retained when only 3' readouts are used (but highest experiment coverage) - we see that all scores are slightly lower. Again there is some reduction in the proportion of genes recovered and the final score.
extracted_gene_categories_K562_highest<- extract_gene_categories(resultsK562_highest_extracted_output)
extracted_gene_categories_K562_highest$proportion_core_essential_in_essential_II
## [1] 0.6553672
extracted_gene_categories_K562_highest$proportion_core_essential_in_essential_I
## [1] 0.8474576
extracted_gene_categories_K562_highest$proportion_non_essential_genes_in_essential_II
## [1] 0.009977827
extracted_gene_categories_K562_highest$proportion_non_essential_genes_in_essential_I
## [1] 0.05875831
extracted_gene_categories_K562_highest$score
## [1] 0.6864906
Finally we check how well genes of different essentiality types are retained with the the lowest coverage level - now the power of the screen has decreased dramatically.
extracted_gene_categories_K562_lowest<- extract_gene_categories(resultsK562_lowest_extracted_output)
extracted_gene_categories_K562_lowest$proportion_core_essential_in_essential_II
## [1] 0.5338983
extracted_gene_categories_K562_lowest$proportion_core_essential_in_essential_I
## [1] 0.3163842
extracted_gene_categories_K562_lowest$proportion_non_essential_genes_in_essential_II
## [1] 0.06762749
extracted_gene_categories_K562_lowest$proportion_non_essential_genes_in_essential_I
## [1] 0.01995565
extracted_gene_categories_K562_lowest$score
## [1] 0.3910395
Now we check directly how well genes of essentiality of types I and II found in the gold standard data set are retained with the lower Cas9 levels and only 3' readout (but highest experiment coverage):
We can directly compare the sets of essentiality types I and II found from the two screens:
comparison_Cas9_K562 <- compare_to_gold_standard(resultsK562_highCas9_extracted_output,resultsK562_highest_extracted_output)
comparison_Cas9_K562
## $proportion_genes_recovered_II_FNR
## [1] 0.9193422
##
## $proportion_genes_recovered_I_FNR
## [1] 0.9995626
##
## $proportion_genes_recovered_II
## [1] 0.550509
##
## $proportion_genes_recovered_I
## [1] 0.6137358
proportion_genes_recovered_I/proportion_genes_recovered_II is the proportion of essential genes of types I and II from the gold standard set that are also contained in the corresponding set for the new screen. The closer to one, the better the two screens agree in identifying genes at different essentiality types. A number close to 1 shows that the new screen has the same power as the gold standard to detect essential genes. proportion_genes_recovered_I_FNR/proportion_genes_recovered_II_FNR is the proportion of essential genes of types I and II from the gold standard set that are also contained in the corresponding set for the new screen, if the new screen is controlled for FNR (false negative rate) instead of FDR. This number must be close to one, as otherwise this shows a misclassification problem that essential genes are likely significantly not essential in the new screen.
Here, the new screen is of less power, but the above misclassification problem is not substantial.
Now we check how well genes of different essentiality types are retained with the lower Cas9 levels if we use both the 3' and 5' readout - there is a minor improvement.
comparison_Cas9_K562_B <- compare_to_gold_standard(resultsK562_highCas9_extracted_output,resultsK562_highest_3_5_extracted_output)
comparison_Cas9_K562_B
## $proportion_genes_recovered_II_FNR
## [1] 0.9201253
##
## $proportion_genes_recovered_I_FNR
## [1] 1
##
## $proportion_genes_recovered_II
## [1] 0.5810493
##
## $proportion_genes_recovered_I
## [1] 0.644357
This shows that by including both 3' and 5' readout, the genes set obtained are only slightly closer to the ones obtained by using the data with the higher Cas9 levels (and both 3' and 5' readout).
For screens of lower Cas9 levels, 3' readout, and different levels of experiment coverage we compare the lengths of the credible intervals to find the optimal coverage level for the screen. Smaller credible intervals correspond to a screen with more power to separate essential and nonessential genes, therefore we want to find the level of coverage from which onward increases in coverage do not decrease the credible intervals.
comparison_highest_lowest <- compare_length_of_credible_intervals(resultsK562_highest_extracted_output,resultsK562_lowest_extracted_output, "160x-180x","25x-30x")
comparison_highest_high <- compare_length_of_credible_intervals(resultsK562_highest_extracted_output,resultsK562_high_extracted_output, "160x-180x","120x-130x")
The BSure package also includes a method to test very robustly for genes that are essential of type II in one cell line, but not in a different cell line. The robustness is achieved by using stringent FDR control for the first cell line and stringent FNR control for the second cell line. The method can only be used if the
find_differential_genes(resultsK562_lowest_extracted_output,HT29_full_extracted_output)
## [1] "Less than 75% of genes essential of type II in screen one at the given FDR are essential of type II in screen 2 at the given FNR. This indicates a power imbalance across the screens to detect essential genes, and therefore this method to detect differential essentiality cannot be used."
## NULL
find_differential_genes(HT29_full_extracted_output,resultsK562_highCas9_extracted_output)
## [1] "NUP133" "SF3B14" "POLA2" "TWISTNB" "DDX20" "HIST1H2BC"
## [7] "KRTAP4-11" "EIF1" "RPL36A" "TTC1" "TRIM37" "KIAA1432"
## [13] "LAS1L" "HIST1H2AL" "MAD2L1" "CSNK1A1" "RPS5" "INTS6"
## [19] "FAM86B1" "NDE1" "HIST1H2BF" "LCE1E"
find_differential_genes(resultsK562_highCas9_extracted_output,HT29_full_extracted_output)
## [1] "Less than 75% of genes essential of type II in screen one at the given FDR are essential of type II in screen 2 at the given FNR. This indicates a power imbalance across the screens to detect essential genes, and therefore this method to detect differential essentiality cannot be used."
## NULL
The figures above suggest 120x-130x is the optimal coverage, as credible intervals are already comparable to those of the 160x-180x screen.
sessionInfo()
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] BSure_0.0.0.9000 knitr_1.30 BiocStyle_2.16.1
##
## loaded via a namespace (and not attached):
## [1] tidyr_1.1.2 jsonlite_1.7.1 carData_3.0-4
## [4] RcppParallel_5.0.2 StanHeaders_2.21.0-6 assertthat_0.2.1
## [7] BiocManager_1.30.10 stats4_4.0.2 cellranger_1.1.0
## [10] yaml_2.2.1 pillar_1.4.6 backports_1.1.10
## [13] glue_1.4.2 digest_0.6.25 polyclip_1.10-0
## [16] ggsignif_0.6.0 colorspace_1.4-1 eulerr_6.1.0
## [19] htmltools_0.5.0 pkgconfig_2.0.3 rstan_2.21.2
## [22] broom_0.7.1 haven_2.3.1 bookdown_0.20
## [25] purrr_0.3.4 scales_1.1.1 processx_3.4.4
## [28] openxlsx_4.2.2 rootSolve_1.8.2.1 rio_0.5.16
## [31] tibble_3.0.3 generics_0.0.2 farver_2.0.3
## [34] car_3.0-10 ggplot2_3.3.2 ellipsis_0.3.1
## [37] ggpubr_0.4.0 withr_2.3.0 cli_2.0.2
## [40] magrittr_1.5 crayon_1.3.4 readxl_1.3.1
## [43] evaluate_0.14 ps_1.3.4 fansi_0.4.1
## [46] MASS_7.3-51.6 rstatix_0.6.0 forcats_0.5.0
## [49] foreign_0.8-80 pkgbuild_1.1.0 ggthemes_4.2.0
## [52] tools_4.0.2 loo_2.3.1 data.table_1.13.0
## [55] prettyunits_1.1.1 hms_0.5.3 lifecycle_0.2.0
## [58] matrixStats_0.57.0 stringr_1.4.0 V8_3.2.0
## [61] munsell_0.5.0 zip_2.1.1 callr_3.4.4
## [64] compiler_4.0.2 rlang_0.4.7 grid_4.0.2
## [67] labeling_0.3 rmarkdown_2.4 gtable_0.3.0
## [70] codetools_0.2-16 inline_0.3.16 abind_1.4-5
## [73] curl_4.3 R6_2.4.1 gridExtra_2.3
## [76] dplyr_1.0.2 polylabelr_0.2.0 stringi_1.5.3
## [79] parallel_4.0.2 Rcpp_1.0.5 vctrs_0.3.4
## [82] tidyselect_1.1.0 xfun_0.18
Allen, Felicity, Fiona Behan, Anton Khodak, Francesco Iorio, Kosuke Yusa, Mathew Garnett, and Leopold Parts. 2019. “JACKS: joint analysis of CRISPR/Cas9 knockout screens.” Genome Research 29 (3). Cold Spring Harbor Laboratory Press: 464–71. doi:10.1101/gr.238923.118.
Behan, Fiona M., Francesco Iorio, Gabriele Picco, Emanuel Gonçalves, Charlotte M. Beaver, Giorgia Migliardi, Rita Santos, et al. 2019. “Prioritization of cancer therapeutic targets using CRISPR–Cas9 screens.” Nature 568 (7753). Nature Publishing Group: 511–16. doi:10.1038/s41586-019-1103-9.
Hart, Traver, Kevin R Brown, Fabrice Sircoulomb, Robert Rottapel, and Jason Moffat. 2014. “Measuring error rates in genomic perturbation screens: gold standards for human functional genomics.” Molecular Systems Biology 10 (7). EMBO: 733. doi:10.15252/msb.20145216.
Hart, Traver, Amy Hin Yan Tong, Katie Chan, Jolanda Van Leeuwen, Ashwin Seetharaman, Michael Aregger, Megha Chandrashekhar, et al. 2017. “Evaluation and design of genome-wide CRISPR/SpCas9 knockout screens.” G3: Genes, Genomes, Genetics 7 (8). Genetics Society of America: 2719–27. doi:10.1534/g3.117.041277.
Peets, Elin Madli, Luca Crepaldi, Yan Zhou, Felicity Allen, Rasa Elmentaite, Guillaume Noell, Gemma Turner, Vivek Iyer, and Leopold Parts. 2019. “Minimized double guide RNA libraries enable scale-limited CRISPR/Cas9 screens.” bioRxiv, November. Cold Spring Harbor Laboratory, 859652. doi:10.1101/859652.
Strauss, ME, and L Parts. 2020. In Preparation.